Support vector methods for survival analysis: a comparison between ranking and regression approaches

Authors

  • Vanya Van Belle
  • Kristiaan Pelckmans
  • Sabine Van Huffel
  • Johan A. K. Suykens
Abstract

OBJECTIVE To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data.

METHODS The literature describes two approaches based on support vector machines for dealing with censored observations. The first rephrases the task as a ranking problem via the concordance index, a problem that can be solved efficiently within the framework of structural risk minimization and convex optimization. The second uses a regression approach and handles censoring by means of inequality constraints. The goal of this paper is twofold: (i) to introduce a new model combining the ranking and regression strategies, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) to compare the three techniques on 6 clinical and 3 high-dimensional datasets and to discuss their relevance relative to classical approaches for survival data.

RESULTS We compare SVM-based survival models based on ranking constraints, on regression constraints, and on both ranking and regression constraints. Performance is compared by means of three measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have significantly different survival from patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index to between 0 and 1. Our results indicate significantly better performance for models that include regression constraints than for models based only on ranking constraints.

CONCLUSIONS This work gives empirical evidence that SVM-based models using regression constraints perform significantly better than SVM-based models based on ranking constraints. Our experiments show comparable performance on clinical data for methods including only regression constraints and methods including both regression and ranking constraints. On high-dimensional data, the former model performs better. However, the regression-only approach has no theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included.
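The concordance index named in the abstract can be illustrated with a minimal sketch. This is a common Harrell-style pairwise definition for right-censored data, not necessarily the exact variant used in the paper. The function name `concordance_index`, its arguments, and the convention that a higher score means longer predicted survival are assumptions made for this illustration.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Harrell-style concordance index for right-censored data (illustrative).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the observation is censored
    scores : predicted prognostic index; ASSUMPTION: higher = longer survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time is an observed event (otherwise the true order is unknown).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] < scores[b]:
            concordant += 1.0  # predicted order matches observed order
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the second subject is censored at time 4.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
print(concordance_index(times, events, [1, 2, 3, 4]))
print(concordance_index(times, events, [3, 1, 2, 4]))
```

Under this definition a value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and pairs whose earlier time is censored are skipped because their true order cannot be determined.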

Similar resources

Support vector regression for prediction of gas reservoirs permeability

Reservoir permeability is a critical parameter for characterization of the hydrocarbon reservoirs. In fact, determination of permeability is a crucial task in reserve estimation, production and development. Traditional methods for permeability prediction are well log and core data analysis which are very expensive and time-consuming. Well log data is an alternative approach for prediction of pe...

Performance Evaluation of Support Vector Regression Models for Survival Analysis: A Simulation Study

Desirable features of support vector regression (SVR) models have led researchers to extend them to survival problems. In the current paper we evaluate and compare the performance of different SVR models and the Cox model using simulated and real data sets with different characteristics. Several SVR models are applied: 1) SVR with only regression constraints (standard SVR); 2) SVR with regression an...

Prediction of true critical temperature and pressure of binary hydrocarbon mixtures: A Comparison between the artificial neural networks and the support vector machine

Two main objectives have been considered in this paper: providing a good model to predict the critical temperature and pressure of binary hydrocarbon mixtures, and comparing the efficiency of the artificial neural network algorithms and the support vector regression as two commonly used soft computing methods. In order to have a fair comparison and to achieve the highest efficiency, a comprehen...

Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...

Application of Support Vector Machine Regression for Predicting Critical Responses of Flexible Pavements

This paper aims to assess the application of Support Vector Machine (SVM) regression to the analysis of flexible pavements. To this end, 10,000 four-layer flexible pavement sections consisting of an asphalt concrete layer, granular base layer, granular subbase layer, and subgrade soil were analyzed under the effect of standard axle loading using multi-layered elastic theory and pavement critical r...

Journal title:
  • Artificial intelligence in medicine

Volume 53, Issue 2

Pages: -

Publication date: 2011